Usefulness of Sentiment Analysis

Authors

  • Jussi Karlgren
  • Magnus Sahlgren
  • Fredrik Olsson
  • Fredrik Espinoza
  • Ola Hamfors
Abstract

What can text sentiment analysis technology be used for, and does a more usage-informed view on sentiment analysis pose new requirements on technology development?

1 Human emotion, attitude, mood, affect, sentiment, opinion, and appeal

Analysis of sentiment in text is a new and rapidly growing field of study and application. This paper outlines some application areas for sentiment analysis technology, and discusses what requirements a technology for sentiment analysis of text should be able to meet.

The human sensations of emotion, attitude, mood, affect, sentiment, opinion, and appeal all contribute to the basic categories of sentiment analysis of text, but they have been studied in their own right for a long time. Traditionally, this has been done in the behavioural sciences [9], but today also by information technologists, especially with respect to interaction design.

“Emotion”, “attitude”, “mood”, “affect”, “sentiment”, and “appeal” are everyday words. No consensus beyond the general vernacular usage of the most common terms can currently be assumed, but usage mostly tends to hold that affect or affective state is the more general term, emotion a momentary, mostly conscious sensation, and mood an affective frame over a longer time span, not necessarily consciously acknowledged by its holder.

These affective aspects of human behaviour and information processing are studied in various ways and from differing perspectives, but the assumption of most researchers is that people are in continuously changing affective states of some sort, that the activities people engage in have emotional impact, and that their decision making, behaviour, and performance are informed by their affective state.
This appears to be true even for very mundane tasks such as workplace tasks or accessing information items, but most importantly for the purposes of this paper, in producing and understanding information items, and, it is assumed, even to the extent that mood, with respect to some topic or facet of life, will colour and influence the understanding, generation, or processing of information on another quite different topic.

Sentiment analysis of text typically assumes that lexical items found in the text carry attitudinal loading. Previous work on the loading of individual features and the affective reaction of human subjects to linguistic items on the level of words and terms [16] or still images [15] quite often takes “emotion labels” to be given, accepted, and comprehensible to test subjects as a basis for the study of correlation between emotions of various kinds [13]. This is a fairly far-reaching simplified model of human emotion, but human emotion in sentiment analysis of text is typically simplified further to be measurable somewhere on a scale from positive to negative. This paper argues that this simplified knowledge model in fact makes the task of informed sentiment analysis of text more complex than it should be.

2 Sentiment Analysis of Text as an Applied Technology

2.1 Consumer Attitude

It is widely understood that word of mouth phenomena play an important role in informing and affecting consumer decisions, and in building and destroying brand reputation. User-generated Internet content such as forums, blogs, and BBSes facilitates this process, and is undermining the authoritative status historically carried by traditional media, especially in markets where the authority and independence of traditional media is low for political or commercial reasons. Applications to address the analysis of consumer mood are drivers for much of the research done in the area of textual sentiment analysis [10].

The application of sentiment analysis to consumer attitude can be viewed from two perspectives:

  1. tools for market analysts to refine the offerings producers wish to make available and known to consumers, or
  2. tools for consumers to mine experiences of peers in the face of a challenging purchase decision.

In the latter case, consumers might wish to find reviews or comments made about some product or service by others, especially critical ones, and rank them by authoritativeness, reliability, and thoroughness or some other quality criteria. In the former case, producers might be most interested in aggregating the general mood of consumers vis-à-vis their product or service. In both cases, a broad recall of opinion is a first requirement for useful exploitation of the technology; a further analysis will benefit from a fine-grained break-down by facet. In the latter case, the hypothesis is that a broad spectrum of peer opinions is a useful supplementary or overriding source of knowledge for making informed purchase decisions. In the former case, the hypothesis is that a broad range of consumer opinions can be mined by running a high-recall sieve over vaguely expressed opinionated text, capturing signal which otherwise might not have been detected.

2.2 Investment Trends

Herding behaviour is an attractive model for explaining and modelling certain price movements on the stock market. The recent and emerging research direction to model actors’ tendencies and biases to move in concert or in averting and seeking risk and rewards asymmetrically in accordance with some underlying latent behavioural variables takes as a point of departure that most of the movements that can be observed in trading data have unknown, unknowable or even random causes [31, 4]. The quote

“... we consider a set of N investors, each of whom has either bullish or bearish opinion ... At every time step each of N investors can change [opinion]. ... The probability of [changing opinion] ... depends only on the bullish sentiment described as the number of bullish investors among the total of N investors. The number of bullish investors then forms a Markov chain ...” [25]

is typical in that the model described may be computationally sophisticated, but the information sources which cause the processes it describes are viewed to be beyond the scope of the model itself. These starting points, rather weak from the standpoint of information science, allow for the injection of new information into the predictive models.

Sentiment analysis of text has been tested and found to carry some signal in several recent reports [20, 24, 21, 30, 3, 2]. The hypothesis in these cases is one or a combination of the following:

  1. the people who trade indicate their preferences and reveal their deliberations in advance of action by writing about them in public fora, or
  2. the public sentiment vis-à-vis some tradable asset can be found through judicious analysis of public expressions of opinion with respect to this asset, and this sentiment is a fair estimate of the sentiment of traders, or
  3. there is such a large volume of high-quality informed analyses available in published media texts that even with a competent topical search engine, no human reader can make sense of it all without an automatic opinion aggregator, or
  4. the public mood in general, not necessarily bound to expressions of sentiment vis-à-vis some tradable asset, will act as a filter which informs the trading decisions traders make with respect to tradable assets of all or alternatively some specific kinds.

2.3 Security Concerns

Similarly to market and financial analysts, intelligence and security analysts want to identify and keep track of certain user-initiated discussions and postings on forums, blogs, newsgroups, and other user-generated web content. This domain has at least two distinct usage scenarios:

  1. tracking public mood to detect and predict security threats or disruptive public behaviour, the hypothesis being that e.g. the tendency for public protest can be monitored and public action may be ignited or catalysed by public communicative behaviour and reflected in the sentiment expressed in public text; and
  2. identifying and monitoring certain individuals or certain documents as being threatening, risky, abusive or help-seeking, the hypothesis being that individuals who express threatening or abusive sentiments in public texts can be reliably identified and that bluster can be usefully distinguished from imminent bite through text analysis.

The first task is clearly related to that of tracking public mood in the previous two domains of consumer sentiment and investment mood: a broad-recall tracker to find the risk of public sentiment (or the sentiment of some targeted group) boiling over. The second task is a very different one, that of identifying specific sentiments and specific indications of future action in fine-grained analysis of specific texts or text streams. This is not unlike the general task of author profiling and authorship attribution, where the technical hypothesis is that a computational analysis of language and observable linguistic items may be more exact and more revealing than a human reader would be able to achieve.

3 Sentiment Analysis as an Engineering Question

Given the above descriptions of information need and potentially lucrative and fruitful application domains for large-scale sentiment analysis of text, the application of known text analysis tools from information retrieval would seem to be straightforward.
Many recent approaches to sentiment analysis have been patterned on search technology, under the reasonable assumption that most of the attitudinal signal in text is lexical [17]. Under this assumption, the procedure for computing sentiment loading for text is straightforward: occurrences of lexical items can be counted, those occurrence counts aggregated and tabulated, the items weighted according to previously observed occurrences in attitudinally loaded texts, and the resulting statistics processed by categorisation algorithms originally developed for lexically based topical categorisation.

  • And the sound quality my God!
  • Raymond left no room for error on his recordings and it shows.
  • Definitely one of the better tracks on the album.
  • Wow, could have been a expansion pack.

Table 1. Some benchmark example sentences (from [27])

The benchmarks that drive research in this area are consumer reviews with a text-level annotation of author attitude and some collections of sentence- or clause-level attitudinal items [29, 18, 1]. Different types of benchmark give rise to different optimisation strategies for the algorithms employed. If the task is understood to be driven by a text-level analysis, the weighting of lexical items and the ideal categorisation of texts given those items will be different from the task of identifying mentions, clauses, or sentences.

The first task, that of classifying texts such as review texts as being positive or negative, typically achieves an accuracy of about 70 to 90 per cent, depending on topical area. Agreement among human annotators is quite high. Results for the second task, that of classifying sentences as being positive or negative, are typically distinctly lower, about 60 to 80 per cent, as is the agreement among human annotators. This is not surprising given the open-ended nature of human language: the examples in Table 1 are difficult to assess as being positive or negative without a broader discourse context and knowledge of the expressive habits of the author.
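
The lexical procedure outlined in Section 3 (count lexical items, weight them by their previously observed occurrences in attitudinally loaded texts, and pass the result to a categoriser) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the four-text toy training set, and the choice of smoothed log-odds as the weighting scheme are not taken from the paper or from the systems it cites.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_weights(labelled_texts, smoothing=1.0):
    """Weight each word by the smoothed log-odds of appearing in
    positive rather than negative training texts (illustrative choice)."""
    pos, neg = Counter(), Counter()
    for text, label in labelled_texts:
        (pos if label == "positive" else neg).update(tokenize(text))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        w: math.log((pos[w] + smoothing) / pos_total)
        - math.log((neg[w] + smoothing) / neg_total)
        for w in vocab
    }


def score(text, weights):
    """Aggregate occurrence counts, each multiplied by its word weight."""
    counts = Counter(tokenize(text))
    return sum(n * weights.get(w, 0.0) for w, n in counts.items())


def classify(text, weights):
    """Place the text on the positive/negative scale; ties count as positive."""
    return "positive" if score(text, weights) >= 0 else "negative"


# Hypothetical toy training data, invented for this sketch.
TRAINING = [
    ("great sound and a great album", "positive"),
    ("wonderful recording, great tracks", "positive"),
    ("terrible sound and a boring album", "negative"),
    ("boring tracks, terrible recording", "negative"),
]
weights = train_weights(TRAINING)
```

A classifier of this kind has nothing to go on for a sentence such as "Wow, could have been a expansion pack." from Table 1, since none of its words carries a learned weight, which is the kind of limitation the benchmark discussion above points to.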

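
The herding model quoted in Section 2.2 can also be sketched as a small simulation. The quote states only that each of N investors may change opinion at every time step with a probability that depends on the number of bullish investors; the linear form of that probability, the parameter names `base` and `herding`, and their default values are assumptions introduced here, not part of the cited model.

```python
import random


def step(k, n, rng, base=0.01, herding=0.1):
    """Advance the chain one time step from k bullish investors out of n.

    Assumed switching rule: an investor is more likely to change
    opinion the larger the share of investors holding the other view.
    """
    frac = k / n
    p_up = base + herding * frac          # bearish investor turns bullish
    p_down = base + herding * (1 - frac)  # bullish investor turns bearish
    to_bull = sum(rng.random() < p_up for _ in range(n - k))
    to_bear = sum(rng.random() < p_down for _ in range(k))
    return k + to_bull - to_bear


def simulate(n=100, steps=200, seed=0, k0=None):
    """Return the sequence of bullish-investor counts over time."""
    rng = random.Random(seed)
    k = n // 2 if k0 is None else k0
    path = [k]
    for _ in range(steps):
        k = step(k, n, rng)
        path.append(k)
    return path
```

The next state depends only on the current count `k`, so the sequence returned by `simulate` is a Markov chain in the sense of the quote. Nothing in the sketch says why an investor changes opinion, which is the gap that text-derived sentiment signals are proposed to fill.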
Similar resources

Sentiment analysis methods in Persian text: A survey

With the explosive growth of social media such as Twitter, reviews on e-commerce websites, and comments on news websites, individuals and organizations are increasingly using opinions in these media for their decision making. Sentiment analysis is one of the techniques used to analyze users' opinions in recent years. Persian language has specific features and thereby requires unique meth...

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis problem is one of the emerging fields. This problem deals with information extraction and knowledge discovery from textual data using natural language processing has attracted the attention of many researchers. Construction of sentiment lexicon as a valuable language resource is a one of the imp...

Sentiment Analysis of Social Networking Data Using Categorized Dictionary

Sentiment analysis is the process of analyzing a person’s perception or belief about a particular subject matter. However, finding correct opinion or interest from multi-facet sentiment data is a tedious task. In this paper, a method to improve the sentiment accuracy by utilizing the concept of categorized dictionary for sentiment classification and analysis is proposed.  A categorized dictiona...

Text Analytics of Customers on Twitter: Brand Sentiments in Customer Support

Brand community interactions and online customer support have become major platforms of brand sentiment strengthening and loyalty creation. Rapid brand responses to each customer request though inbound tweets in twitter and taking proper actions to cover the needs of customers are the key elements of positive brand sentiment creation and product or service initiative management in the realm of ...

2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework

Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...

A Grouping Hotel Recommender System Based on Deep Learning and Sentiment Analysis

Recommender systems are important tools for users to identify their preferred items and for businesses to improve their products and services. In recent years, the use of online services for selection and reservation of hotels have witnessed a booming growth. Customer’ reviews have replaced the word of mouth marketing, but searching hotels based on user priorities is more time-consuming. This s...

Publication date: 2012